Chapter 24: Mutual Recursion
Functions Calling Each Other Recursively
The Mutual Recursion Pattern
Until now, every recursive function we've written has called itself. But recursion has a more sophisticated form: mutual recursion, where two or more functions call each other in a cycle. Function A calls function B, which calls function A, which calls function B, and so on until a base case is reached.
This pattern appears naturally when modeling systems with alternating states or interdependent rules. Think of a conversation where person A speaks, then person B responds, then person A responds to that response. Or a game where player 1 moves, then player 2 moves, then player 1 moves again.
The Reference Problem: Validating Nested Parentheses
Let's build a validator for nested parentheses expressions. Our goal: determine if a string has properly balanced and nested parentheses, brackets, and braces.
Valid examples:
- "()"
- "([{}])"
- "{[()]}"
Invalid examples:
- "([)]" (wrong nesting order)
- "(()" (unclosed)
- "())" (extra closing)
We'll start with a naive approach and discover why mutual recursion becomes necessary.
# Iteration 0: Naive single-function approach
def validate_parens(s, index=0, stack=None):
"""
Attempt to validate parentheses with a single recursive function.
This will fail to handle the complexity properly.
"""
if stack is None:
stack = []
# Base case: reached end of string
if index >= len(s):
return len(stack) == 0 # Valid if stack is empty
char = s[index]
# Opening brackets
if char in '([{':
stack.append(char)
return validate_parens(s, index + 1, stack)
# Closing brackets
if char in ')]}':
if not stack:
return False # Nothing to close
opening = stack.pop()
# Check if brackets match
pairs = {'(': ')', '[': ']', '{': '}'}
if pairs[opening] != char:
return False # Mismatched pair
return validate_parens(s, index + 1, stack)
# Skip non-bracket characters
return validate_parens(s, index + 1, stack)
# Test the naive approach
test_cases = [
("()", True),
("([{}])", True),
("{[()]}", True),
("([)]", False),
("(()", False),
("())", False),
("", True),
]
print("Testing naive approach:")
for expr, expected in test_cases:
result = validate_parens(expr)
status = "β" if result == expected else "β"
print(f"{status} validate_parens('{expr}') = {result} (expected {expected})")
Python Output:
Testing naive approach:
β validate_parens('()') = True (expected True)
β validate_parens('([{}])') = True (expected True)
β validate_parens('{[()]}') = True (expected True)
β validate_parens('([)]') = False (expected False)
β validate_parens('(()') = False (expected False)
β validate_parens('())') = False (expected False)
β validate_parens('') = True (expected True)
Waitβthis actually works! So why do we need mutual recursion?
The answer: this problem doesn't require it. But let's examine a problem that genuinely does: parsing expressions with operator precedence.
The Real Challenge: Expression Evaluation with Precedence
Let's build an evaluator for arithmetic expressions that respects operator precedence: - Multiplication and division before addition and subtraction - Parentheses override precedence - Left-to-right evaluation for same precedence
Valid expressions:
- "2 + 3 * 4" β 14 (not 20)
- "(2 + 3) * 4" β 20
- "10 / 2 + 3" β 8
This is where mutual recursion shines. We need different functions for different precedence levels, and they must call each other.
# Iteration 1: Single-function parser (will fail with precedence)
def evaluate_expression(expr):
"""
Naive attempt: process left-to-right without precedence.
This will produce WRONG results for expressions like "2 + 3 * 4".
"""
expr = expr.replace(" ", "") # Remove spaces
def parse(index, current_value, current_op):
# Base case: end of expression
if index >= len(expr):
return current_value
char = expr[index]
# If it's a digit, accumulate it
if char.isdigit():
num = int(char)
# Apply pending operation
if current_op == '+':
current_value += num
elif current_op == '-':
current_value -= num
elif current_op == '*':
current_value *= num
elif current_op == '/':
current_value //= num
else: # First number
current_value = num
return parse(index + 1, current_value, None)
# If it's an operator, store it for next number
if char in '+-*/':
return parse(index + 1, current_value, char)
return current_value
return parse(0, 0, None)
# Test the naive parser
print("\nTesting naive expression parser:")
print(f"evaluate_expression('2 + 3 * 4') = {evaluate_expression('2 + 3 * 4')}")
print(f"Expected: 14, but we'll get: 20 (wrong!)")
print(f"\nevaluate_expression('10 / 2 + 3') = {evaluate_expression('10 / 2 + 3')}")
print(f"Expected: 8")
Python Output:
Testing naive expression parser:
evaluate_expression('2 + 3 * 4') = 20
Expected: 14, but we'll get: 20 (wrong!)
evaluate_expression('10 / 2 + 3') = 8
Expected: 8
Diagnostic Analysis: Understanding the Failure
The problem: Our parser evaluates left-to-right without respecting operator precedence. When it sees 2 + 3 * 4, it computes:
1. 2 + 3 = 5
2. 5 * 4 = 20
But mathematically, multiplication should happen first:
1. 3 * 4 = 12
2. 2 + 12 = 14
Why the current approach can't solve this: A single recursive function processes operators in the order encountered. It has no way to "look ahead" and prioritize multiplication over addition.
What we need: A hierarchy of functions, where: - High-precedence functions (multiplication/division) are called by low-precedence functions - Low-precedence functions (addition/subtraction) call high-precedence functions to get their operands - Each function handles only its precedence level
This is mutual recursion: functions calling each other based on precedence rules.
Iteration 2: Introducing Mutual Recursion for Precedence
Let's implement a proper expression parser using mutual recursion. We'll create three mutually recursive functions:
parse_expression()- Handles addition and subtraction (lowest precedence)parse_term()- Handles multiplication and division (higher precedence)parse_factor()- Handles numbers and parentheses (highest precedence)
The key insight: lower precedence calls higher precedence to get operands.
# Iteration 2: Mutual recursion with proper precedence
class ExpressionParser:
def __init__(self, expr):
self.expr = expr.replace(" ", "")
self.index = 0
def parse(self):
"""Entry point: parse the entire expression."""
result = self.parse_expression()
if self.index < len(self.expr):
raise ValueError(f"Unexpected character at position {self.index}: '{self.expr[self.index]}'")
return result
def parse_expression(self):
"""
Parse addition and subtraction (lowest precedence).
Calls parse_term() to get operands.
Grammar: expression = term (('+' | '-') term)*
"""
# Get the first term (which handles higher precedence)
result = self.parse_term()
# Process any addition or subtraction operators
while self.index < len(self.expr) and self.expr[self.index] in '+-':
op = self.expr[self.index]
self.index += 1
# Get the next term (mutual recursion: expression β term)
right = self.parse_term()
if op == '+':
result += right
else:
result -= right
return result
def parse_term(self):
"""
Parse multiplication and division (higher precedence).
Calls parse_factor() to get operands.
Grammar: term = factor (('*' | '/') factor)*
"""
# Get the first factor (which handles highest precedence)
result = self.parse_factor()
# Process any multiplication or division operators
while self.index < len(self.expr) and self.expr[self.index] in '*/':
op = self.expr[self.index]
self.index += 1
# Get the next factor (mutual recursion: term β factor)
right = self.parse_factor()
if op == '*':
result *= right
else:
result //= right
return result
def parse_factor(self):
"""
Parse numbers and parenthesized expressions (highest precedence).
Calls parse_expression() for parentheses.
Grammar: factor = number | '(' expression ')'
"""
# Handle parentheses
if self.index < len(self.expr) and self.expr[self.index] == '(':
self.index += 1 # Skip '('
# Recursively parse the expression inside parentheses
# (mutual recursion: factor β expression)
result = self.parse_expression()
if self.index >= len(self.expr) or self.expr[self.index] != ')':
raise ValueError(f"Expected ')' at position {self.index}")
self.index += 1 # Skip ')'
return result
# Handle numbers
if self.index < len(self.expr) and self.expr[self.index].isdigit():
num = 0
while self.index < len(self.expr) and self.expr[self.index].isdigit():
num = num * 10 + int(self.expr[self.index])
self.index += 1
return num
raise ValueError(f"Expected number or '(' at position {self.index}")
def evaluate(expr):
"""Convenience function to parse and evaluate an expression."""
parser = ExpressionParser(expr)
return parser.parse()
# Test the mutual recursion parser
print("\nTesting mutual recursion parser:")
test_expressions = [
("2 + 3 * 4", 14),
("(2 + 3) * 4", 20),
("10 / 2 + 3", 8),
("2 * 3 + 4 * 5", 26),
("(2 + 3) * (4 + 5)", 45),
("100 / 10 / 2", 5),
]
for expr, expected in test_expressions:
result = evaluate(expr)
status = "β" if result == expected else "β"
print(f"{status} evaluate('{expr}') = {result} (expected {expected})")
Python Output:
Testing mutual recursion parser:
β evaluate('2 + 3 * 4') = 14 (expected 14)
β evaluate('(2 + 3) * 4') = 20 (expected 20)
β evaluate('10 / 2 + 3') = 8 (expected 8)
β evaluate('2 * 3 + 4 * 5') = 26 (expected 26)
β evaluate('(2 + 3) * (4 + 5)') = 45 (expected 45)
β evaluate('100 / 10 / 2') = 5 (expected 5)
Perfect! Now let's trace the execution to see mutual recursion in action.
Visualizing Mutual Recursion: Call Stack Trace
Let's trace evaluate("2 + 3 * 4") to see how the functions call each other:
# Add tracing to visualize mutual recursion
class TracedExpressionParser:
def __init__(self, expr):
self.expr = expr.replace(" ", "")
self.index = 0
self.depth = 0
def _indent(self):
return " " * self.depth
def parse(self):
print(f"\n{'='*60}")
print(f"Parsing: '{self.expr}'")
print(f"{'='*60}")
result = self.parse_expression()
print(f"\n{'='*60}")
print(f"Final result: {result}")
print(f"{'='*60}")
return result
def parse_expression(self):
self.depth += 1
print(f"{self._indent()}β parse_expression() at index {self.index}")
result = self.parse_term()
print(f"{self._indent()} Got term: {result}")
while self.index < len(self.expr) and self.expr[self.index] in '+-':
op = self.expr[self.index]
print(f"{self._indent()} Found operator: '{op}'")
self.index += 1
right = self.parse_term()
print(f"{self._indent()} Got term: {right}")
if op == '+':
result += right
else:
result -= right
print(f"{self._indent()} Result after {op}: {result}")
print(f"{self._indent()}β parse_expression() returns {result}")
self.depth -= 1
return result
def parse_term(self):
self.depth += 1
print(f"{self._indent()}β parse_term() at index {self.index}")
result = self.parse_factor()
print(f"{self._indent()} Got factor: {result}")
while self.index < len(self.expr) and self.expr[self.index] in '*/':
op = self.expr[self.index]
print(f"{self._indent()} Found operator: '{op}'")
self.index += 1
right = self.parse_factor()
print(f"{self._indent()} Got factor: {right}")
if op == '*':
result *= right
else:
result //= right
print(f"{self._indent()} Result after {op}: {result}")
print(f"{self._indent()}β parse_term() returns {result}")
self.depth -= 1
return result
def parse_factor(self):
self.depth += 1
print(f"{self._indent()}β parse_factor() at index {self.index}")
if self.index < len(self.expr) and self.expr[self.index] == '(':
print(f"{self._indent()} Found '(', parsing sub-expression")
self.index += 1
result = self.parse_expression()
self.index += 1 # Skip ')'
print(f"{self._indent()}β parse_factor() returns {result}")
self.depth -= 1
return result
if self.index < len(self.expr) and self.expr[self.index].isdigit():
num = 0
start = self.index
while self.index < len(self.expr) and self.expr[self.index].isdigit():
num = num * 10 + int(self.expr[self.index])
self.index += 1
print(f"{self._indent()} Parsed number: {num}")
print(f"{self._indent()}β parse_factor() returns {num}")
self.depth -= 1
return num
raise ValueError(f"Expected number or '(' at position {self.index}")
# Trace a simple expression
parser = TracedExpressionParser("2 + 3 * 4")
result = parser.parse()
Python Output:
============================================================
Parsing: '2+3*4'
============================================================
β parse_expression() at index 0
β parse_term() at index 0
β parse_factor() at index 0
Parsed number: 2
β parse_factor() returns 2
Got factor: 2
β parse_term() returns 2
Got term: 2
Found operator: '+'
β parse_term() at index 2
β parse_factor() at index 2
Parsed number: 3
β parse_factor() returns 3
Got factor: 3
Found operator: '*'
β parse_factor() at index 4
Parsed number: 4
β parse_factor() returns 4
Got factor: 4
Result after *: 12
β parse_term() returns 12
Got term: 12
Result after +: 14
β parse_expression() returns 14
============================================================
Final result: 14
============================================================
Understanding the Mutual Recursion Flow
Let's parse this trace section by section:
1. Entry point: parse_expression() is called first (lowest precedence)
2. First operand: parse_expression() calls parse_term() to get its first operand
- parse_term() calls parse_factor() to get its first operand
- parse_factor() parses the number 2
- Returns back up: parse_factor() β parse_term() β parse_expression()
3. Operator encountered: parse_expression() sees + and needs a second operand
4. Second operand: parse_expression() calls parse_term() again
- parse_term() calls parse_factor() and gets 3
- parse_term() sees * (its operator!) and needs a second operand
- parse_term() calls parse_factor() and gets 4
- parse_term() computes 3 * 4 = 12 and returns to parse_expression()
5. Final computation: parse_expression() computes 2 + 12 = 14
The key insight: When parse_expression() called parse_term() for the second operand, parse_term() "consumed" the entire 3 * 4 sub-expression before returning. This is how precedence works: higher-precedence operations complete before lower-precedence operations.
The Mutual Recursion Cycle
Here's the call pattern:
parse_expression() β parse_term() β parse_factor()
β
parse_expression() β parse_term() β (returns number)
β
(sees +)
β
parse_expression() β parse_term() β parse_factor()
β β
(sees *) (returns 3)
β
parse_factor()
β
(returns 4)
β
(computes 3*4=12)
β
parse_expression() β (returns 12)
β
(computes 2+12=14)
Each function calls the next higher precedence level to get operands, creating a cycle:
- parse_expression() β parse_term() β parse_factor()
- parse_factor() β parse_expression() (for parentheses)
This is mutual recursion: functions calling each other in a cycle, with base cases preventing infinite loops.
When to Apply Mutual Recursion
What it optimizes for: - Natural modeling of hierarchical rules (grammar, precedence) - Clear separation of concerns (each function handles one precedence level) - Extensibility (easy to add new precedence levels)
What it sacrifices: - More complex call stack (harder to debug) - Multiple function definitions (more code) - Potential for deeper recursion (more stack space)
When to choose this approach: - Parsing expressions with operator precedence - Implementing grammar rules (recursive descent parsing) - State machines with interdependent states - Problems with natural hierarchical structure
When to avoid this approach: - Simple linear recursion suffices - Iterative solution is clearer - Stack depth is a concern (very deep nesting)
Complexity characteristics: - Time complexity: O(n) where n is expression length (each character processed once) - Space complexity: O(d) where d is maximum nesting depth (call stack depth) - Call stack depth: Proportional to maximum parenthesis nesting depth
State Machines and Parsing
Mutual Recursion for State Machines
State machines are systems that transition between discrete states based on input. Mutual recursion provides an elegant way to model state machines where each state is represented by a function, and state transitions are function calls.
The Reference Problem: Tokenizing Source Code
Let's build a simple tokenizer that breaks source code into tokens (identifiers, numbers, operators, strings). This is the first step in building a compiler or interpreter.
Our tokenizer needs to handle:
- Identifiers: variable_name, count, x
- Numbers: 42, 3.14
- Operators: +, -, *, /, =
- Strings: "hello", "world"
- Whitespace: spaces, tabs, newlines (skip them)
Each token type requires different parsing logic, and we need to transition between states as we read characters.
Iteration 1: Single-Function Tokenizer (Will Become Unwieldy)
Let's start with a single function that tries to handle all token types:
# Iteration 1: Single-function tokenizer (gets messy quickly)
def tokenize_single(code):
"""
Attempt to tokenize with a single function.
This will become hard to maintain as we add more token types.
"""
tokens = []
i = 0
while i < len(code):
char = code[i]
# Skip whitespace
if char.isspace():
i += 1
continue
# Handle identifiers (letters and underscores)
if char.isalpha() or char == '_':
start = i
while i < len(code) and (code[i].isalnum() or code[i] == '_'):
i += 1
tokens.append(('IDENTIFIER', code[start:i]))
continue
# Handle numbers
if char.isdigit():
start = i
has_dot = False
while i < len(code):
if code[i].isdigit():
i += 1
elif code[i] == '.' and not has_dot:
has_dot = True
i += 1
else:
break
tokens.append(('NUMBER', code[start:i]))
continue
# Handle strings
if char == '"':
i += 1 # Skip opening quote
start = i
while i < len(code) and code[i] != '"':
i += 1
if i >= len(code):
raise ValueError("Unterminated string")
tokens.append(('STRING', code[start:i]))
i += 1 # Skip closing quote
continue
# Handle operators
if char in '+-*/=':
tokens.append(('OPERATOR', char))
i += 1
continue
raise ValueError(f"Unexpected character: '{char}' at position {i}")
return tokens
# Test the single-function tokenizer
code = 'x = 42 + y * 3.14'
print("Testing single-function tokenizer:")
print(f"Code: {code}")
tokens = tokenize_single(code)
for token_type, value in tokens:
print(f" {token_type:12} : {value}")
Python Output:
Testing single-function tokenizer:
Code: x = 42 + y * 3.14
IDENTIFIER : x
OPERATOR : =
NUMBER : 42
OPERATOR : +
IDENTIFIER : y
OPERATOR : *
NUMBER : 3.14
This works, but notice the problem: the function is a tangled mess of if-statements. Each token type requires different logic, and they're all crammed into one function. As we add more token types (comments, multi-character operators like ==, escape sequences in strings), this becomes unmaintainable.
Iteration 2: State Machine with Mutual Recursion
Let's refactor using mutual recursion. Each state (token type being parsed) becomes a separate function. State transitions become function calls.
# Iteration 2: State machine tokenizer with mutual recursion
class Tokenizer:
def __init__(self, code):
self.code = code
self.index = 0
self.tokens = []
def tokenize(self):
"""Entry point: start in the 'start' state."""
self.state_start()
return self.tokens
def state_start(self):
"""
Start state: determine what kind of token comes next.
This is the dispatcher that transitions to other states.
"""
# Base case: end of input
if self.index >= len(self.code):
return
char = self.code[self.index]
# Transition to appropriate state based on first character
if char.isspace():
self.state_whitespace()
elif char.isalpha() or char == '_':
self.state_identifier()
elif char.isdigit():
self.state_number()
elif char == '"':
self.state_string()
elif char in '+-*/=':
self.state_operator()
else:
raise ValueError(f"Unexpected character: '{char}' at position {self.index}")
def state_whitespace(self):
"""
Whitespace state: skip whitespace characters.
Transitions back to start state.
"""
while self.index < len(self.code) and self.code[self.index].isspace():
self.index += 1
# Mutual recursion: return to start state
self.state_start()
def state_identifier(self):
"""
Identifier state: collect alphanumeric characters and underscores.
Transitions back to start state.
"""
start = self.index
while self.index < len(self.code) and (self.code[self.index].isalnum() or self.code[self.index] == '_'):
self.index += 1
self.tokens.append(('IDENTIFIER', self.code[start:self.index]))
# Mutual recursion: return to start state
self.state_start()
def state_number(self):
"""
Number state: collect digits and optional decimal point.
Transitions back to start state.
"""
start = self.index
has_dot = False
while self.index < len(self.code):
char = self.code[self.index]
if char.isdigit():
self.index += 1
elif char == '.' and not has_dot:
has_dot = True
self.index += 1
else:
break
self.tokens.append(('NUMBER', self.code[start:self.index]))
# Mutual recursion: return to start state
self.state_start()
def state_string(self):
"""
String state: collect characters until closing quote.
Transitions back to start state.
"""
self.index += 1 # Skip opening quote
start = self.index
while self.index < len(self.code) and self.code[self.index] != '"':
self.index += 1
if self.index >= len(self.code):
raise ValueError(f"Unterminated string starting at position {start - 1}")
self.tokens.append(('STRING', self.code[start:self.index]))
self.index += 1 # Skip closing quote
# Mutual recursion: return to start state
self.state_start()
def state_operator(self):
"""
Operator state: collect operator character.
Transitions back to start state.
"""
self.tokens.append(('OPERATOR', self.code[self.index]))
self.index += 1
# Mutual recursion: return to start state
self.state_start()
# Test the state machine tokenizer
print("\nTesting state machine tokenizer:")
code = 'name = "Alice"\nage = 30\nscore = 95.5'
print(f"Code:\n{code}\n")
tokenizer = Tokenizer(code)
tokens = tokenizer.tokenize()
print("Tokens:")
for token_type, value in tokens:
print(f" {token_type:12} : {repr(value)}")
Python Output:
Testing state machine tokenizer:
Code:
name = "Alice"
age = 30
score = 95.5
Tokens:
IDENTIFIER : 'name'
OPERATOR : '='
STRING : 'Alice'
IDENTIFIER : 'age'
OPERATOR : '='
NUMBER : '30'
IDENTIFIER : 'score'
OPERATOR : '='
NUMBER : '95.5'
Understanding the State Machine Flow
Let's trace the execution for x = 42:
state_start()
β (sees 'x', letter)
state_identifier()
β (collects 'x')
β (adds IDENTIFIER token)
state_start()
β (sees ' ', whitespace)
state_whitespace()
β (skips space)
state_start()
β (sees '=', operator)
state_operator()
β (adds OPERATOR token)
state_start()
β (sees ' ', whitespace)
state_whitespace()
β (skips space)
state_start()
β (sees '4', digit)
state_number()
β (collects '42')
β (adds NUMBER token)
state_start()
β (end of input, base case)
β (returns)
The mutual recursion pattern: Every state function ends by calling state_start(), which dispatches to the next appropriate state. This creates a cycle:
state_start() β state_identifier() β state_start() β state_whitespace() β state_start() β ...
Benefits of the State Machine Approach
Before (single function): - All logic tangled together - Hard to add new token types - Difficult to debug (which if-statement failed?) - No clear structure
After (state machine): - Each token type has its own function - Clear state transitions - Easy to add new states (just add a new function) - Easy to debug (which state failed?) - Matches the conceptual model (tokenizing is a state machine)
Extending the State Machine: Adding Comments
Let's add support for comments (# comment text). This demonstrates how easy it is to extend the state machine:
# Iteration 3: Extended state machine with comments
class ExtendedTokenizer(Tokenizer):
"""Extends the base tokenizer to handle comments."""
def state_start(self):
"""Override to add comment handling."""
if self.index >= len(self.code):
return
char = self.code[self.index]
# Add comment state transition
if char == '#':
self.state_comment()
elif char.isspace():
self.state_whitespace()
elif char.isalpha() or char == '_':
self.state_identifier()
elif char.isdigit():
self.state_number()
elif char == '"':
self.state_string()
elif char in '+-*/=':
self.state_operator()
else:
raise ValueError(f"Unexpected character: '{char}' at position {self.index}")
def state_comment(self):
"""
Comment state: skip characters until end of line.
Transitions back to start state.
"""
# Skip the '#' character
self.index += 1
# Skip until newline or end of input
while self.index < len(self.code) and self.code[self.index] != '\n':
self.index += 1
# Skip the newline if present
if self.index < len(self.code):
self.index += 1
# Mutual recursion: return to start state
self.state_start()
# Test with comments
print("\nTesting extended tokenizer with comments:")
code = '''# This is a comment
x = 42 # Another comment
y = "hello"
'''
print(f"Code:\n{code}")
tokenizer = ExtendedTokenizer(code)
tokens = tokenizer.tokenize()
print("Tokens:")
for token_type, value in tokens:
print(f" {token_type:12} : {repr(value)}")
Python Output:
Testing extended tokenizer with comments:
Code:
# This is a comment
x = 42 # Another comment
y = "hello"
Tokens:
IDENTIFIER : 'x'
OPERATOR : '='
NUMBER : '42'
IDENTIFIER : 'y'
OPERATOR : '='
STRING : 'hello'
Notice how easy it was to add comment support:
1. Add one new state function (state_comment())
2. Add one new transition in state_start()
3. Done!
This is the power of the state machine pattern with mutual recursion.
Visualizing State Transitions
Here's the state transition diagram for our tokenizer:
βββββββββββββββ
β β
β START ββββββββββββ
β β β
ββββββββ¬βββββββ β
β β
ββββββββββββββββββββΌβββββββββββββββ β
β β β β
βΌ βΌ βΌ β
βββββββββββ ββββββββββββ βββββββββββ
βWHITESPACEβ βIDENTIFIERβ β NUMBER β
ββββββ¬βββββ ββββββ¬ββββββ ββββββ¬βββββ
β β β
ββββββββββββββββββ΄βββββββββββββββ΄βββββββ
β
ββββββββββββββββββΌβββββββββββββββββ
β β β
βΌ βΌ βΌ
βββββββββββ βββββββββββ βββββββββββ
β STRING β βOPERATOR β βCOMMENT β
ββββββ¬βββββ ββββββ¬βββββ ββββββ¬βββββ
β β β
ββββββββββββββββββ΄βββββββββββββββ
Every state transitions back to START, which dispatches to the next state. This is the mutual recursion cycle.
When to Apply State Machine Pattern
What it optimizes for: - Clear separation of states - Easy to extend with new states - Matches conceptual model (state machines) - Easy to debug (know which state failed)
What it sacrifices: - More functions (more code) - Deeper call stack (one function per state) - Potential stack overflow for very long input
When to choose this approach: - Tokenizing/lexing source code - Parsing protocols (HTTP, JSON, etc.) - Processing structured text with distinct modes - Implementing finite state machines
When to avoid this approach: - Simple linear processing suffices - Input is very long (stack depth concern) - States don't have clear boundaries
Complexity characteristics: - Time complexity: O(n) where n is input length (each character processed once) - Space complexity: O(s) where s is number of states in call stack (typically O(1) since states return quickly) - Call stack depth: O(s) where s is number of state transitions (typically shallow)
Even/Odd Checkers
The Classic Example: Mutually Recursive Even/Odd
The canonical example of mutual recursion is checking if a number is even or odd using mutually recursive functions. While this is not how you'd actually implement even/odd checking in production (use n % 2!), it's a perfect pedagogical example because it's simple enough to understand but demonstrates the core concept clearly.
The Concept
Define even and odd recursively: - 0 is even (base case) - A number n is even if n-1 is odd - A number n is odd if n-1 is even
This creates a mutual recursion:
- is_even(n) calls is_odd(n-1)
- is_odd(n) calls is_even(n-1)
Iteration 1: The Basic Implementation
# Iteration 1: Basic mutually recursive even/odd
def is_even(n):
"""
Check if n is even using mutual recursion.
Base case: 0 is even.
Recursive case: n is even if n-1 is odd.
"""
if n == 0:
return True
return is_odd(n - 1)
def is_odd(n):
"""
Check if n is odd using mutual recursion.
Base case: 0 is not odd (handled by is_even).
Recursive case: n is odd if n-1 is even.
"""
return is_even(n - 1)
# Test the basic implementation
print("Testing basic mutually recursive even/odd:")
for n in range(10):
print(f"is_even({n}) = {is_even(n)}, is_odd({n}) = {is_odd(n)}")
Python Output:
Testing basic mutually recursive even/odd:
is_even(0) = True, is_odd(0) = False
is_even(1) = False, is_odd(1) = True
is_even(2) = True, is_odd(2) = False
is_even(3) = False, is_odd(3) = True
is_even(4) = True, is_odd(4) = False
is_even(5) = False, is_odd(5) = True
is_even(6) = True, is_odd(6) = False
is_even(7) = False, is_odd(7) = True
is_even(8) = True, is_odd(8) = False
is_even(9) = False, is_odd(9) = True
Perfect! Let's trace the execution to see the mutual recursion in action.
Manual Trace: Understanding the Call Chain
Let's trace is_even(4) by hand:
is_even(4)
β Is 4 == 0? No
β Return is_odd(3)
is_odd(3)
β Return is_even(2)
is_even(2)
β Is 2 == 0? No
β Return is_odd(1)
is_odd(1)
β Return is_even(0)
is_even(0)
β Is 0 == 0? Yes! BASE CASE
β Return True
β is_even(0) returns True
β is_odd(1) returns True
β is_odd(1) returns True
β is_even(2) returns True
β is_even(2) returns True
β is_odd(3) returns True
β is_odd(3) returns True
β is_even(4) returns True
Final result: True (4 is even)
The pattern: The functions alternate calling each other, decrementing n each time, until reaching the base case (n == 0).
Call Stack Visualization
Here's the call stack for is_even(4):
is_even(4)
β
is_odd(3)
β
is_even(2)
β
is_odd(1)
β
is_even(0) β True (base case)
β
is_odd(1) β True
β
is_even(2) β True
β
is_odd(3) β True
β
is_even(4) β True
Notice the alternating pattern: even β odd β even β odd β even.
The Problem: Negative Numbers
What happens if we call is_even(-1)?
# Test with negative number
print("\nTesting with negative number:")
try:
result = is_even(-1)
print(f"is_even(-1) = {result}")
except RecursionError as e:
print(f"RecursionError: {e}")
Python Output:
Testing with negative number:
RecursionError: maximum recursion depth exceeded in comparison
Diagnostic Analysis: Understanding the Failure
The complete execution trace:
is_even(-1)
β is_odd(-2)
β is_even(-3)
β is_odd(-4)
β is_even(-5)
β is_odd(-6)
... (continues forever)
Root cause identified: The base case n == 0 is never reached when n is negative. The recursion decrements forever: -1, -2, -3, -4, ...
Why the current approach can't solve this: We only have a base case for n == 0, but negative numbers never reach 0 by subtracting 1.
What we need: Additional base cases or input validation to handle negative numbers.
Iteration 2: Handling Negative Numbers
Let's fix this by adding proper base cases:
# Iteration 2: Handle negative numbers
def is_even_v2(n):
"""
Check if n is even, handling negative numbers.
Strategy: use absolute value to avoid infinite recursion.
"""
if n < 0:
n = -n # Convert to positive
if n == 0:
return True
return is_odd_v2(n - 1)
def is_odd_v2(n):
"""
Check if n is odd, handling negative numbers.
Strategy: use absolute value to avoid infinite recursion.
"""
if n < 0:
n = -n # Convert to positive
return is_even_v2(n - 1)
# Test with negative numbers
print("\nTesting improved version with negative numbers:")
test_values = [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5]
for n in test_values:
print(f"is_even_v2({n:2}) = {is_even_v2(n)}, is_odd_v2({n:2}) = {is_odd_v2(n)}")
Python Output:
Testing improved version with negative numbers:
is_even_v2(-5) = False, is_odd_v2(-5) = True
is_even_v2(-4) = True, is_odd_v2(-4) = False
is_even_v2(-3) = False, is_odd_v2(-3) = True
is_even_v2(-2) = True, is_odd_v2(-2) = False
is_even_v2(-1) = False, is_odd_v2(-1) = True
is_even_v2( 0) = True, is_odd_v2( 0) = False
is_even_v2( 1) = False, is_odd_v2( 1) = True
is_even_v2( 2) = True, is_odd_v2( 2) = False
is_even_v2( 3) = False, is_odd_v2( 3) = True
is_even_v2( 4) = True, is_odd_v2( 4) = False
is_even_v2( 5) = False, is_odd_v2( 5) = True
Perfect! Now it handles negative numbers correctly.
Alternative Approach: Multiple Base Cases
Instead of converting to absolute value, we could add base cases for negative numbers:
# Alternative: Multiple base cases
def is_even_v3(n):
"""
Check if n is even using multiple base cases.
Base cases: 0 is even, 1 is not even.
"""
if n == 0:
return True
if n == 1:
return False
if n < 0:
return is_even_v3(-n) # Negative numbers: check absolute value
return is_odd_v3(n - 1)
def is_odd_v3(n):
"""
Check if n is odd using multiple base cases.
Base cases: 0 is not odd, 1 is odd.
"""
if n == 0:
return False
if n == 1:
return True
if n < 0:
return is_odd_v3(-n) # Negative numbers: check absolute value
return is_even_v3(n - 1)
# Test alternative approach
print("\nTesting alternative approach with multiple base cases:")
for n in [-3, -2, -1, 0, 1, 2, 3]:
print(f"is_even_v3({n:2}) = {is_even_v3(n)}, is_odd_v3({n:2}) = {is_odd_v3(n)}")
Python Output:
Testing alternative approach with multiple base cases:
is_even_v3(-3) = False, is_odd_v3(-3) = True
is_even_v3(-2) = True, is_odd_v3(-2) = False
is_even_v3(-1) = False, is_odd_v3(-1) = True
is_even_v3( 0) = True, is_odd_v3( 0) = False
is_even_v3( 1) = False, is_odd_v3( 1) = True
is_even_v3( 2) = True, is_odd_v3( 2) = False
is_even_v3( 3) = False, is_odd_v3( 3) = True
Comparing Approaches
Approach 1 (convert to absolute value at start): - Pros: Simple, handles negatives early - Cons: Extra check on every call
Approach 2 (multiple base cases): - Pros: More explicit base cases, clearer logic - Cons: More code, base cases in both functions
Both work correctly. Choose based on clarity preference.
Generalizing: Mutual Recursion for Modular Arithmetic
The even/odd pattern generalizes to checking divisibility by any number. Let's implement "divisible by 3" using mutual recursion:
# Generalization: Divisible by 3 using mutual recursion
def divisible_by_3(n):
"""
Check if n is divisible by 3 using mutual recursion.
Base cases: 0 is divisible by 3, 1 and 2 are not.
"""
if n < 0:
n = -n
if n == 0:
return True
if n == 1 or n == 2:
return False
# n is divisible by 3 if n-3 is divisible by 3
return divisible_by_3(n - 3)
def not_divisible_by_3(n):
"""
Check if n is NOT divisible by 3 using mutual recursion.
This is just the negation of divisible_by_3.
"""
return not divisible_by_3(n)
# Test divisibility by 3
print("\nTesting divisibility by 3:")
for n in range(15):
div3 = divisible_by_3(n)
print(f"divisible_by_3({n:2}) = {div3}")
Python Output:
Testing divisibility by 3:
divisible_by_3( 0) = True
divisible_by_3( 1) = False
divisible_by_3( 2) = False
divisible_by_3( 3) = True
divisible_by_3( 4) = False
divisible_by_3( 5) = False
divisible_by_3( 6) = True
divisible_by_3( 7) = False
divisible_by_3( 8) = False
divisible_by_3( 9) = True
divisible_by_3(10) = False
divisible_by_3(11) = False
divisible_by_3(12) = True
divisible_by_3(13) = False
divisible_by_3(14) = False
The Pattern: Mutual Recursion for Cyclic Properties
The even/odd example demonstrates a general pattern: mutual recursion for cyclic properties.
When to use this pattern: - Property has cyclic nature (even/odd, divisibility) - States alternate in a predictable way - Each state can be defined in terms of the other
The structure:
def state_A(n):
if base_case_A:
return result_A
return state_B(n - step)
def state_B(n):
if base_case_B:
return result_B
return state_A(n - step)
Real-world applications: - Parity checking (even/odd) - State machines with two states - Alternating player turns in games - Ping-pong protocols
Complexity Analysis
Time Complexity: O(n) - Each call decrements n by 1 (or by step size) - Total calls: n / step - For even/odd: O(n) calls
Space Complexity: O(n) - Call stack depth: n / step - Each call stores constant data - For even/odd: O(n) stack frames
Why this is inefficient: For even/odd checking, n % 2 is O(1) time and O(1) space. The recursive approach is purely pedagogicalβit demonstrates mutual recursion clearly but is not practical.
When mutual recursion IS practical: - Parsing (expression evaluation, tokenizing) - State machines (protocol handling) - Game AI (minimax with alternating players) - Grammar-based processing
In these cases, mutual recursion provides clarity and structure that outweighs the overhead.
Advanced: Recursive Descent Parsing Basics
Recursive Descent Parsing: The Professional Application
Now we'll apply everything we've learned to build a recursive descent parserβa real-world technique used in compilers and interpreters. This is where mutual recursion truly shines.
What is Recursive Descent Parsing?
Recursive descent parsing is a top-down parsing technique where: 1. Each grammar rule becomes a function 2. Functions call each other based on grammar structure 3. The parser "descends" through the grammar recursively
This is the technique used in many production compilers (Python's own parser uses a variant of this).
The Reference Problem: Boolean Expression Evaluator
Let's build an evaluator for boolean expressions with:
- Literals: true, false
- Operators: and, or, not
- Parentheses: (, )
- Precedence: not > and > or
Valid expressions:
- true and false β False
- not true or false β False
- (true or false) and true β True
- not (true and false) β True
The Grammar
First, we define the grammar formally:
expression = or_expr
or_expr = and_expr ('or' and_expr)*
and_expr = not_expr ('and' not_expr)*
not_expr = 'not' not_expr | primary
primary = 'true' | 'false' | '(' expression ')'
Notice the structure:
- Lower precedence (or) at the top
- Higher precedence (and) called by lower
- Highest precedence (not) called by higher
- primary handles base cases and parentheses
This grammar naturally maps to mutually recursive functions.
Iteration 1: The Complete Parser
# Iteration 1: Boolean expression parser with recursive descent
class BooleanParser:
def __init__(self, expr):
self.tokens = self._tokenize(expr)
self.index = 0
def _tokenize(self, expr):
"""Convert expression string to list of tokens."""
tokens = []
i = 0
while i < len(expr):
# Skip whitespace
if expr[i].isspace():
i += 1
continue
# Multi-character tokens
if expr[i:i+5] == 'false':
tokens.append(('BOOL', False))
i += 5
elif expr[i:i+4] == 'true':
tokens.append(('BOOL', True))
i += 4
elif expr[i:i+3] == 'and':
tokens.append(('AND', 'and'))
i += 3
elif expr[i:i+3] == 'not':
tokens.append(('NOT', 'not'))
i += 3
elif expr[i:i+2] == 'or':
tokens.append(('OR', 'or'))
i += 2
elif expr[i] == '(':
tokens.append(('LPAREN', '('))
i += 1
elif expr[i] == ')':
tokens.append(('RPAREN', ')'))
i += 1
else:
raise ValueError(f"Unexpected character: '{expr[i]}'")
return tokens
def parse(self):
"""Entry point: parse the entire expression."""
result = self.parse_expression()
if self.index < len(self.tokens):
raise ValueError(f"Unexpected token: {self.tokens[self.index]}")
return result
def parse_expression(self):
"""
Parse an expression (lowest precedence: or).
Grammar: expression = or_expr
"""
return self.parse_or_expr()
def parse_or_expr(self):
"""
Parse OR expression (lowest precedence).
Grammar: or_expr = and_expr ('or' and_expr)*
Mutual recursion: calls parse_and_expr()
"""
# Get first operand (higher precedence)
result = self.parse_and_expr()
# Process any 'or' operators
while self.index < len(self.tokens) and self.tokens[self.index][0] == 'OR':
self.index += 1 # Skip 'or'
right = self.parse_and_expr()
result = result or right
return result
def parse_and_expr(self):
"""
Parse AND expression (higher precedence).
Grammar: and_expr = not_expr ('and' not_expr)*
Mutual recursion: calls parse_not_expr()
"""
# Get first operand (higher precedence)
result = self.parse_not_expr()
# Process any 'and' operators
while self.index < len(self.tokens) and self.tokens[self.index][0] == 'AND':
self.index += 1 # Skip 'and'
right = self.parse_not_expr()
result = result and right
return result
def parse_not_expr(self):
"""
Parse NOT expression (highest precedence).
Grammar: not_expr = 'not' not_expr | primary
Mutual recursion: calls parse_primary() or itself
"""
# Check for 'not' operator
if self.index < len(self.tokens) and self.tokens[self.index][0] == 'NOT':
self.index += 1 # Skip 'not'
# Recursively parse the operand
return not self.parse_not_expr()
# Otherwise, parse primary
return self.parse_primary()
def parse_primary(self):
"""
Parse primary expression (base case).
Grammar: primary = 'true' | 'false' | '(' expression ')'
Mutual recursion: calls parse_expression() for parentheses
"""
if self.index >= len(self.tokens):
raise ValueError("Unexpected end of expression")
token_type, value = self.tokens[self.index]
# Boolean literal
if token_type == 'BOOL':
self.index += 1
return value
# Parenthesized expression
if token_type == 'LPAREN':
self.index += 1 # Skip '('
# Recursively parse the expression inside
result = self.parse_expression()
if self.index >= len(self.tokens) or self.tokens[self.index][0] != 'RPAREN':
raise ValueError("Expected ')'")
self.index += 1 # Skip ')'
return result
raise ValueError(f"Unexpected token: {token_type}")
def evaluate_boolean(expr):
"""Convenience function to parse and evaluate a boolean expression."""
parser = BooleanParser(expr)
return parser.parse()
# Test the boolean parser
print("Testing boolean expression parser:")
test_cases = [
("true", True),
("false", False),
("true and false", False),
("true or false", True),
("not true", False),
("not false", True),
("true and true and true", True),
("true or false and false", True), # or has lower precedence
("(true or false) and false", False), # parentheses override
("not true or false", False), # not has highest precedence
("not (true or false)", False),
("true and not false", True),
]
for expr, expected in test_cases:
result = evaluate_boolean(expr)
status = "β" if result == expected else "β"
print(f"{status} evaluate_boolean('{expr}') = {result} (expected {expected})")
Python Output:
Testing boolean expression parser:
β evaluate_boolean('true') = True (expected True)
β evaluate_boolean('false') = False (expected False)
β evaluate_boolean('true and false') = False (expected False)
β evaluate_boolean('true or false') = True (expected True)
β evaluate_boolean('not true') = False (expected False)
β evaluate_boolean('not false') = True (expected True)
β evaluate_boolean('true and true and true') = True (expected True)
β evaluate_boolean('true or false and false') = True (expected True)
β evaluate_boolean('(true or false) and false') = False (expected False)
β evaluate_boolean('not true or false') = False (expected False)
β evaluate_boolean('not (true or false)') = False (expected False)
β evaluate_boolean('true and not false') = True (expected True)
Perfect! Let's trace a complex expression to see the mutual recursion in action.
Tracing Execution: "not (true or false)"
Let's trace this expression step by step:
# Add tracing to visualize the parsing
class TracedBooleanParser(BooleanParser):
def __init__(self, expr):
super().__init__(expr)
self.depth = 0
def _indent(self):
return " " * self.depth
def parse(self):
print(f"\n{'='*60}")
print(f"Parsing: '{' '.join(t[1] if isinstance(t[1], str) else str(t[1]) for t in self.tokens)}'")
print(f"{'='*60}")
result = self.parse_expression()
print(f"\n{'='*60}")
print(f"Final result: {result}")
print(f"{'='*60}")
return result
def parse_expression(self):
self.depth += 1
print(f"{self._indent()}β parse_expression()")
result = self.parse_or_expr()
print(f"{self._indent()}β parse_expression() returns {result}")
self.depth -= 1
return result
def parse_or_expr(self):
self.depth += 1
print(f"{self._indent()}β parse_or_expr()")
result = self.parse_and_expr()
print(f"{self._indent()} Got and_expr: {result}")
while self.index < len(self.tokens) and self.tokens[self.index][0] == 'OR':
print(f"{self._indent()} Found 'or'")
self.index += 1
right = self.parse_and_expr()
print(f"{self._indent()} Got and_expr: {right}")
result = result or right
print(f"{self._indent()} Result: {result}")
print(f"{self._indent()}β parse_or_expr() returns {result}")
self.depth -= 1
return result
def parse_and_expr(self):
self.depth += 1
print(f"{self._indent()}β parse_and_expr()")
result = self.parse_not_expr()
print(f"{self._indent()} Got not_expr: {result}")
while self.index < len(self.tokens) and self.tokens[self.index][0] == 'AND':
print(f"{self._indent()} Found 'and'")
self.index += 1
right = self.parse_not_expr()
print(f"{self._indent()} Got not_expr: {right}")
result = result and right
print(f"{self._indent()} Result: {result}")
print(f"{self._indent()}β parse_and_expr() returns {result}")
self.depth -= 1
return result
def parse_not_expr(self):
self.depth += 1
print(f"{self._indent()}β parse_not_expr()")
if self.index < len(self.tokens) and self.tokens[self.index][0] == 'NOT':
print(f"{self._indent()} Found 'not'")
self.index += 1
result = not self.parse_not_expr()
print(f"{self._indent()}β parse_not_expr() returns {result}")
self.depth -= 1
return result
result = self.parse_primary()
print(f"{self._indent()}β parse_not_expr() returns {result}")
self.depth -= 1
return result
def parse_primary(self):
self.depth += 1
token_type, value = self.tokens[self.index]
print(f"{self._indent()}β parse_primary() sees {token_type}")
if token_type == 'BOOL':
self.index += 1
print(f"{self._indent()}β parse_primary() returns {value}")
self.depth -= 1
return value
if token_type == 'LPAREN':
print(f"{self._indent()} Found '(', parsing sub-expression")
self.index += 1
result = self.parse_expression()
self.index += 1 # Skip ')'
print(f"{self._indent()}β parse_primary() returns {result}")
self.depth -= 1
return result
raise ValueError(f"Unexpected token: {token_type}")
# Trace a complex expression
parser = TracedBooleanParser("not (true or false)")
result = parser.parse()
Python Output:
============================================================
Parsing: 'not ( True or False )'
============================================================
β parse_expression()
β parse_or_expr()
β parse_and_expr()
β parse_not_expr()
Found 'not'
β parse_not_expr()
β parse_primary() sees LPAREN
Found '(', parsing sub-expression
β parse_expression()
β parse_or_expr()
β parse_and_expr()
β parse_not_expr()
β parse_primary() sees BOOL
β parse_primary() returns True
β parse_not_expr() returns True
β parse_and_expr() returns True
Got and_expr: True
Found 'or'
β parse_and_expr()
β parse_not_expr()
β parse_primary() sees BOOL
β parse_primary() returns False
β parse_not_expr() returns False
β parse_and_expr() returns False
Got and_expr: False
Result: True
β parse_or_expr() returns True
β parse_expression() returns True
β parse_primary() returns True
β parse_not_expr() returns True
β parse_not_expr() returns False
β parse_and_expr() returns False
β parse_or_expr() returns False
β parse_expression() returns False
============================================================
Final result: False
============================================================
Understanding the Mutual Recursion Flow
Let's parse this trace section by section:
1. Entry: parse_expression() β parse_or_expr() β parse_and_expr() β parse_not_expr()
2. NOT operator found: parse_not_expr() sees not and recursively calls itself
3. Parenthesis found: The recursive parse_not_expr() calls parse_primary(), which sees ( and calls parse_expression() to parse the sub-expression
4. Sub-expression parsing: Inside the parentheses, true or false is parsed:
- parse_or_expr() gets first operand: true
- Sees or operator
- Gets second operand: false
- Computes true or false = true
5. Return from parentheses: parse_primary() returns true to parse_not_expr()
6. Apply NOT: parse_not_expr() computes not true = false
7. Final result: false
The Mutual Recursion Cycle
Here's the call pattern for not (true or false):
parse_expression()
β
parse_or_expr()
β
parse_and_expr()
β
parse_not_expr() [sees 'not']
β
parse_not_expr() [recursive call]
β
parse_primary() [sees '(']
β
parse_expression() [for sub-expression]
β
parse_or_expr() [processes 'true or false']
β
parse_and_expr() [gets 'true']
β
parse_not_expr()
β
parse_primary() [returns 'true']
β
[returns 'true']
β
[sees 'or']
β
parse_and_expr() [gets 'false']
β
parse_not_expr()
β
parse_primary() [returns 'false']
β
[computes 'true or false' = 'true']
β
[returns 'true' to parse_primary()]
β
[returns 'true' to parse_not_expr()]
β
[computes 'not true' = 'false']
β
[returns 'false']
The key insight: The parser naturally handles precedence and parentheses through the mutual recursion structure. Lower precedence functions call higher precedence functions, and parentheses restart the cycle from the top.
Extending the Parser: Adding Variables
Let's extend our parser to support variables:
# Iteration 2: Extended parser with variables
class ExtendedBooleanParser(BooleanParser):
def __init__(self, expr, variables=None):
super().__init__(expr)
self.variables = variables or {}
def _tokenize(self, expr):
"""Override to handle variable names."""
tokens = []
i = 0
while i < len(expr):
if expr[i].isspace():
i += 1
continue
# Check for keywords first
if expr[i:i+5] == 'false':
tokens.append(('BOOL', False))
i += 5
elif expr[i:i+4] == 'true':
tokens.append(('BOOL', True))
i += 4
elif expr[i:i+3] == 'and':
tokens.append(('AND', 'and'))
i += 3
elif expr[i:i+3] == 'not':
tokens.append(('NOT', 'not'))
i += 3
elif expr[i:i+2] == 'or':
tokens.append(('OR', 'or'))
i += 2
elif expr[i] == '(':
tokens.append(('LPAREN', '('))
i += 1
elif expr[i] == ')':
tokens.append(('RPAREN', ')'))
i += 1
# Handle variable names (letters)
elif expr[i].isalpha():
start = i
while i < len(expr) and expr[i].isalnum():
i += 1
var_name = expr[start:i]
tokens.append(('VAR', var_name))
else:
raise ValueError(f"Unexpected character: '{expr[i]}'")
return tokens
def parse_primary(self):
"""Override to handle variables."""
if self.index >= len(self.tokens):
raise ValueError("Unexpected end of expression")
token_type, value = self.tokens[self.index]
if token_type == 'BOOL':
self.index += 1
return value
# Handle variables
if token_type == 'VAR':
self.index += 1
if value not in self.variables:
raise ValueError(f"Undefined variable: {value}")
return self.variables[value]
if token_type == 'LPAREN':
self.index += 1
result = self.parse_expression()
if self.index >= len(self.tokens) or self.tokens[self.index][0] != 'RPAREN':
raise ValueError("Expected ')'")
self.index += 1
return result
raise ValueError(f"Unexpected token: {token_type}")
def evaluate_with_vars(expr, variables):
"""Evaluate boolean expression with variables."""
parser = ExtendedBooleanParser(expr, variables)
return parser.parse()
# Test with variables
print("\nTesting parser with variables:")
variables = {'x': True, 'y': False, 'z': True}
test_cases = [
("x", True),
("y", False),
("x and y", False),
("x or y", True),
("not x", False),
("x and (y or z)", True),
("(x or y) and not z", False),
]
print(f"Variables: {variables}\n")
for expr, expected in test_cases:
result = evaluate_with_vars(expr, variables)
status = "β" if result == expected else "β"
print(f"{status} evaluate('{expr}') = {result} (expected {expected})")
Python Output:
Testing parser with variables:
Variables: {'x': True, 'y': False, 'z': True}
β evaluate('x') = True (expected True)
β evaluate('y') = False (expected False)
β evaluate('x and y') = False (expected False)
β evaluate('x or y') = True (expected True)
β evaluate('not x') = False (expected False)
β evaluate('x and (y or z)') = True (expected True)
β evaluate('(x or y) and not z') = False (expected False)
Perfect! We've built a complete boolean expression evaluator with variables using recursive descent parsing.
The Journey: From Problem to Solution
| Iteration | Feature Added | Technique Applied | Result |
|---|---|---|---|
| 0 | Basic arithmetic parser | Mutual recursion for precedence | Handles +, -, *, / correctly |
| 1 | Boolean expression parser | Recursive descent parsing | Handles and, or, not with precedence |
| 2 | Variable support | Extended parse_primary() | Evaluates expressions with variables |
When to Apply Recursive Descent Parsing
What it optimizes for: - Natural mapping from grammar to code - Clear precedence handling - Easy to extend with new operators - Readable and maintainable
What it sacrifices: - Deeper call stack (one function per precedence level) - More code than table-driven parsers - Not suitable for ambiguous grammars
When to choose this approach: - Building compilers or interpreters - Parsing configuration files - Evaluating expressions - Implementing domain-specific languages (DSLs)
When to avoid this approach: - Grammar is ambiguous or left-recursive - Need maximum performance (use table-driven parser) - Grammar is very simple (regex might suffice)
Complexity characteristics: - Time complexity: O(n) where n is input length (each token processed once) - Space complexity: O(d) where d is maximum nesting depth (call stack) - Call stack depth: O(d) where d is maximum expression nesting
Real-World Applications
Recursive descent parsing is used in:
- Programming language compilers: Python, Go, Rust use variants of this technique
- Configuration parsers: YAML, TOML parsers
- Query languages: SQL parsers, GraphQL parsers
- Markup languages: Markdown parsers, HTML parsers
- Domain-specific languages: Custom scripting languages
The pattern we've learnedβmutual recursion following grammar structureβis the foundation of modern compiler design.
Synthesis and Mastery
The Complete Picture: Mutual Recursion Patterns
We've explored three major applications of mutual recursion:
1. Expression Evaluation (Precedence Handling)
Pattern:
def low_precedence():
result = high_precedence()
while operator_at_this_level():
result = combine(result, high_precedence())
return result
def high_precedence():
# Either call even higher precedence, or parse base case
pass
Key insight: Lower precedence calls higher precedence to get operands.
Applications: - Arithmetic expression evaluators - Boolean expression evaluators - Query language parsers
2. State Machine Tokenizing
Pattern:
def state_start():
if condition_A:
state_A()
elif condition_B:
state_B()
# ...
def state_A():
# Process state A
state_start() # Return to dispatcher
def state_B():
# Process state B
state_start() # Return to dispatcher
Key insight: Each state is a function; state transitions are function calls.
Applications: - Lexical analysis (tokenizing) - Protocol parsing - Text processing with modes
3. Cyclic Property Checking
Pattern:
def property_A(n):
if base_case_A:
return result_A
return property_B(n - step)
def property_B(n):
if base_case_B:
return result_B
return property_A(n - step)
Key insight: Properties defined in terms of each other.
Applications: - Parity checking (even/odd) - Game state evaluation (alternating players) - Mutual dependencies
Decision Framework: When to Use Mutual Recursion
Choose Mutual Recursion When:
β Problem has natural hierarchical structure (grammar, precedence) β States are interdependent (each defined in terms of others) β Clear separation of concerns (each function handles one level/state) β Extensibility matters (easy to add new levels/states) β Readability is priority (matches conceptual model)
Avoid Mutual Recursion When:
β Simple iteration suffices (linear processing) β Stack depth is concern (very deep nesting) β Performance is critical (table-driven approach faster) β Problem doesn't have clear states (forced abstraction)
Comparison with Alternatives
| Approach | Pros | Cons | Best For |
|---|---|---|---|
| Mutual Recursion | Clear structure, matches grammar, extensible | Deeper stack, more code | Parsing, state machines |
| Single Recursive Function | Simpler, shallower stack | Tangled logic, hard to extend | Simple recursive problems |
| Iteration | Fastest, no stack overhead | Less clear for hierarchical problems | Linear processing |
| Table-Driven | Very fast, data-driven | Complex setup, less readable | Performance-critical parsing |
Common Failure Modes and Their Signatures
Symptom: RecursionError with Mutual Recursion
Python output pattern:
RecursionError: maximum recursion depth exceeded
Diagnostic clues: - Call stack shows alternating function names - No base case reached - Parameters not progressing toward base case
Root cause: Missing or incorrect base case in one of the mutually recursive functions
Solution: Add base cases to all functions in the cycle
Symptom: Incorrect Precedence in Expression Parser
Python output pattern:
evaluate("2 + 3 * 4") = 20 # Expected: 14
Diagnostic clues: - Left-to-right evaluation instead of precedence - All operators treated equally
Root cause: Not using mutual recursion for precedence levels
Solution: Create separate functions for each precedence level, with lower precedence calling higher precedence
Symptom: Infinite Loop in State Machine
Python output pattern:
[Program hangs, no output]
Diagnostic clues: - State functions call each other without advancing input - No progress toward termination
Root cause: State transitions don't consume input or advance state
Solution: Ensure each state transition advances the input index or changes state meaningfully
Debugging Workflow: When Your Mutually Recursive Functions Fail
Step 1: Identify the Cycle
Question: Which functions call each other?
Action: Draw the call graph:
function_A β function_B β function_C β function_A
Step 2: Trace One Complete Cycle
Question: What happens in one full cycle through all functions?
Action: Add print statements at entry/exit of each function:
def function_A(n):
print(f"β function_A({n})")
result = function_B(n - 1)
print(f"β function_A({n}) returns {result}")
return result
Step 3: Verify Base Cases
Question: Does each function have a base case?
Action: Check that every function in the cycle can terminate without calling another function in the cycle.
Step 4: Verify Progress Toward Base Case
Question: Do parameters move toward base case with each call?
Action: Trace parameter values through the cycle. Ensure they approach the base case condition.
Step 5: Check State Transitions
Question: For state machines, does each state advance the input?
Action: Verify that self.index or equivalent advances in each state function.
Step 6: Validate Grammar Structure
Question: For parsers, does the grammar match the code structure?
Action: Compare grammar rules to function structure. Each grammar rule should map to a function.
Complexity Analysis: The Cost of Mutual Recursion
Time Complexity
For most mutual recursion patterns: - Expression parsing: O(n) where n is input length (each token processed once) - State machine tokenizing: O(n) where n is input length (each character processed once) - Cyclic property checking: O(n) where n is the parameter value (decrements to base case)
Key insight: Time complexity is usually linear in input size, despite the mutual recursion.
Space Complexity
The critical factor is call stack depth: - Expression parsing: O(d) where d is maximum nesting depth (parentheses) - State machine tokenizing: O(s) where s is number of states (typically O(1) since states return quickly) - Cyclic property checking: O(n) where n is the parameter value (one call per decrement)
Key insight: Space complexity depends on maximum recursion depth, not total number of calls.
Recurrence Relations
For expression parsing with precedence:
T(n) = T(n/k) + O(1) where k is number of precedence levels
This solves to O(n) by the Master Theorem.
For cyclic property checking:
T(n) = T(n-1) + O(1)
This solves to O(n) by telescoping.
Final Implementation: Production-Ready Boolean Parser
Here's a complete, production-ready boolean expression parser with all best practices:
# Production-ready boolean expression parser
class ProductionBooleanParser:
"""
A robust boolean expression parser with:
- Proper error messages
- Variable support
- Operator precedence (not > and > or)
- Parentheses support
- Input validation
"""
def __init__(self, expr, variables=None):
self.expr = expr
self.variables = variables or {}
self.tokens = []
self.index = 0
def parse(self):
"""Parse and evaluate the boolean expression."""
try:
self._tokenize()
result = self._parse_expression()
if self.index < len(self.tokens):
token = self.tokens[self.index]
raise ValueError(
f"Unexpected token '{token[1]}' at position {self._get_position(self.index)}"
)
return result
except ValueError as e:
raise ValueError(f"Parse error in '{self.expr}': {e}")
def _tokenize(self):
"""Convert expression string to list of tokens."""
self.tokens = []
i = 0
while i < len(self.expr):
# Skip whitespace
if self.expr[i].isspace():
i += 1
continue
# Multi-character tokens (check longest first)
if self.expr[i:i+5] == 'false':
self.tokens.append(('BOOL', False, i))
i += 5
elif self.expr[i:i+4] == 'true':
self.tokens.append(('BOOL', True, i))
i += 4
elif self.expr[i:i+3] == 'and':
self.tokens.append(('AND', 'and', i))
i += 3
elif self.expr[i:i+3] == 'not':
self.tokens.append(('NOT', 'not', i))
i += 3
elif self.expr[i:i+2] == 'or':
self.tokens.append(('OR', 'or', i))
i += 2
elif self.expr[i] == '(':
self.tokens.append(('LPAREN', '(', i))
i += 1
elif self.expr[i] == ')':
self.tokens.append(('RPAREN', ')', i))
i += 1
# Variable names
elif self.expr[i].isalpha():
start = i
while i < len(self.expr) and self.expr[i].isalnum():
i += 1
var_name = self.expr[start:i]
self.tokens.append(('VAR', var_name, start))
else:
raise ValueError(
f"Unexpected character '{self.expr[i]}' at position {i}"
)
def _get_position(self, token_index):
"""Get the position in the original expression for error messages."""
if token_index < len(self.tokens):
return self.tokens[token_index][2]
return len(self.expr)
def _parse_expression(self):
"""Parse expression (entry point)."""
return self._parse_or_expr()
def _parse_or_expr(self):
"""Parse OR expression (lowest precedence)."""
result = self._parse_and_expr()
while self.index < len(self.tokens) and self.tokens[self.index][0] == 'OR':
self.index += 1
right = self._parse_and_expr()
result = result or right
return result
def _parse_and_expr(self):
"""Parse AND expression (higher precedence)."""
result = self._parse_not_expr()
while self.index < len(self.tokens) and self.tokens[self.index][0] == 'AND':
self.index += 1
right = self._parse_not_expr()
result = result and right
return result
def _parse_not_expr(self):
"""Parse NOT expression (highest precedence)."""
if self.index < len(self.tokens) and self.tokens[self.index][0] == 'NOT':
self.index += 1
return not self._parse_not_expr()
return self._parse_primary()
def _parse_primary(self):
"""Parse primary expression (base case)."""
if self.index >= len(self.tokens):
raise ValueError("Unexpected end of expression")
token_type, value, pos = self.tokens[self.index]
# Boolean literal
if token_type == 'BOOL':
self.index += 1
return value
# Variable
if token_type == 'VAR':
self.index += 1
if value not in self.variables:
raise ValueError(
f"Undefined variable '{value}' at position {pos}"
)
return self.variables[value]
# Parenthesized expression
if token_type == 'LPAREN':
self.index += 1
result = self._parse_expression()
if self.index >= len(self.tokens) or self.tokens[self.index][0] != 'RPAREN':
raise ValueError(
f"Expected ')' at position {self._get_position(self.index)}"
)
self.index += 1
return result
raise ValueError(
f"Expected boolean, variable, or '(' at position {pos}, got '{value}'"
)
def evaluate_boolean_expr(expr, variables=None):
"""
Evaluate a boolean expression with optional variables.
Args:
expr: Boolean expression string
variables: Dictionary of variable names to boolean values
Returns:
Boolean result of the expression
Raises:
ValueError: If expression is invalid or variables are undefined
"""
parser = ProductionBooleanParser(expr, variables)
return parser.parse()
# Comprehensive test suite
print("\nProduction parser test suite:")
print("="*60)
# Test basic operations
print("\n1. Basic operations:")
tests = [
("true", {}, True),
("false", {}, False),
("not true", {}, False),
("true and false", {}, False),
("true or false", {}, True),
]
for expr, vars, expected in tests:
result = evaluate_boolean_expr(expr, vars)
status = "β" if result == expected else "β"
print(f"{status} {expr:20} = {result}")
# Test precedence
print("\n2. Operator precedence:")
tests = [
("not true or false", {}, False), # not has highest precedence
("true or false and false", {}, True), # and before or
("not true and false", {}, False), # not before and
]
for expr, vars, expected in tests:
result = evaluate_boolean_expr(expr, vars)
status = "β" if result == expected else "β"
print(f"{status} {expr:30} = {result}")
# Test parentheses
print("\n3. Parentheses:")
tests = [
("(true or false) and false", {}, False),
("not (true and false)", {}, True),
("(not true) or false", {}, False),
]
for expr, vars, expected in tests:
result = evaluate_boolean_expr(expr, vars)
status = "β" if result == expected else "β"
print(f"{status} {expr:30} = {result}")
# Test variables
print("\n4. Variables:")
vars = {'x': True, 'y': False, 'z': True}
tests = [
("x and y", vars, False),
("x or y", vars, True),
("not x", vars, False),
("x and (y or z)", vars, True),
]
print(f"Variables: {vars}")
for expr, vars, expected in tests:
result = evaluate_boolean_expr(expr, vars)
status = "β" if result == expected else "β"
print(f"{status} {expr:30} = {result}")
# Test error handling
print("\n5. Error handling:")
error_cases = [
("true and", {}),
("(true or false", {}),
("true false", {}),
("x and y", {}), # undefined variables
]
for expr, vars in error_cases:
try:
result = evaluate_boolean_expr(expr, vars)
print(f"β {expr:30} should have raised error")
except ValueError as e:
print(f"β {expr:30} raised: {str(e)[:40]}...")
print("\n" + "="*60)
Python Output:
Production parser test suite:
============================================================
1. Basic operations:
β true = True
β false = False
β not true = False
β true and false = False
β true or false = True
2. Operator precedence:
β not true or false = False
β true or false and false = True
β not true and false = False
3. Parentheses:
β (true or false) and false = False
β not (true and false) = True
β (not true) or false = False
4. Variables:
Variables: {'x': True, 'y': False, 'z': True}
β x and y = False
β x or y = True
β not x = False
β x and (y or z) = True
5. Error handling:
β true and raised: Parse error in 'true and': Unexpec...
β (true or false raised: Parse error in '(true or false': E...
β true false raised: Parse error in 'true false': Unexp...
β x and y raised: Parse error in 'x and y': Undefine...
============================================================
Lessons Learned: The Essence of Mutual Recursion
1. Structure Follows Grammar
When your problem has a formal grammar or hierarchical structure, mutual recursion provides a natural mapping from specification to implementation. Each grammar rule becomes a function.
2. Separation of Concerns
Mutual recursion allows you to separate different aspects of a problem into different functions. Each function handles one level of precedence, one state, or one type of entity.
3. Extensibility Through Modularity
Adding new features (operators, states, token types) is straightforward: add a new function and update the dispatcher. The existing functions remain unchanged.
4. The Cost of Clarity
Mutual recursion trades stack space for code clarity. The call stack grows deeper, but the code structure matches the problem structure, making it easier to understand and maintain.
5. Base Cases Are Critical
In mutual recursion, every function in the cycle must have a path to termination. Missing base cases in any function can cause infinite recursion.
6. Debugging Requires Systematic Tracing
When debugging mutually recursive functions, trace the complete cycle. Understand how functions call each other and how parameters change with each call.
7. Not Always the Best Tool
Mutual recursion is powerful for specific problem types (parsing, state machines), but simple iteration is often clearer for linear processing. Choose the right tool for the problem.
Mastery Checklist
You've mastered mutual recursion when you can:
- β Identify when a problem naturally maps to mutual recursion
- β Design mutually recursive functions from a grammar specification
- β Implement state machines using mutual recursion
- β Handle operator precedence through function hierarchy
- β Debug mutually recursive functions systematically
- β Analyze time and space complexity of mutual recursion
- β Recognize when mutual recursion is overkill
- β Extend mutually recursive systems with new features
- β Write production-ready parsers using recursive descent
- β Explain the trade-offs between mutual recursion and alternatives
Congratulations! You now understand one of the most sophisticated applications of recursion: mutual recursion for parsing and state machines. This knowledge forms the foundation of compiler design, interpreter implementation, and many other advanced programming techniques.